EN6106 – Emerging Topics in Information Technology
Comprehensive Study Notes (Lecture + References + Past Papers)
1. Microservices Architecture
1.1 Definition and Core Principles
Definition: Microservices architecture decomposes a monolithic application into independent, loosely coupled services, each with its own process, deployment, and scaling capabilities.
Core Principles: Single Responsibility, Decentralized Governance, Technology Diversity, Resilience, Scalability.
1.2 Monolithic vs. Microservices
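| Aspect | Monolithic | Microservices |
| Deployment | Single unit | Independent services |
| Scalability | Difficult (whole application scales as one unit) | Easy (each service scales independently) |
| Fault Isolation | Poor (one failure can affect the whole application) | High (failures are contained within a service) |
| Development | Slower as the codebase grows | Faster (teams build and release independently) |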
1.3 Design Principles
- Single Responsibility Principle: Each service should have a single responsibility.
- Right-sized Services: Avoid over-fragmentation; aim for services that are neither too large nor too small.
- Decentralized Governance: Teams have autonomy over their services.
1.4 Communication Protocols
- Synchronous: REST, Thrift.
- Asynchronous: AMQP, STOMP, MQTT.
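The difference between the two styles can be sketched in Python. This is a minimal illustration under stated assumptions, not a real REST or AMQP client: `inventory_service`, `order_consumer`, and the in-process `queue.Queue` standing in for a message broker are all hypothetical names made up for the example.

```python
import queue
import threading


def inventory_service(sku):
    """Hypothetical service that answers a stock query directly."""
    stock = {"A100": 5, "B200": 0}
    return {"sku": sku, "in_stock": stock.get(sku, 0) > 0}


# Synchronous style: the caller blocks until the reply comes back.
reply = inventory_service("A100")

# Asynchronous style: the producer puts messages on a queue and moves on;
# a consumer processes them later. queue.Queue stands in for a broker here.
broker = queue.Queue()
processed = []


def order_consumer():
    while True:
        message = broker.get()
        if message is None:  # sentinel value: stop consuming
            break
        processed.append(message["order_id"])


worker = threading.Thread(target=order_consumer)
worker.start()
broker.put({"order_id": "o-1"})  # returns immediately, no reply expected
broker.put({"order_id": "o-2"})
broker.put(None)
worker.join()

print(reply)
print(processed)
```

In the synchronous call the caller and the service are coupled in time; in the queued version the producer does not need the consumer to be available at the moment the message is sent.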
1.5 Data Management
Each microservice owns its database; data is accessed only via service APIs.
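A rough sketch of this rule in Python, using in-memory dictionaries in place of real databases. The class names `CustomerService` and `OrderService` and their methods are assumptions made for illustration only.

```python
class CustomerService:
    """Hypothetical service that owns its private customer store."""

    def __init__(self):
        # Private store: no other service reads or writes this directly.
        self._db = {1: {"name": "Ada", "email": "ada@example.com"}}

    def get_customer(self, customer_id):
        """Public API: returns a copy so callers cannot mutate the store."""
        record = self._db.get(customer_id)
        return dict(record) if record else None


class OrderService:
    """Hypothetical service with its own store; reaches customers via the API."""

    def __init__(self, customers):
        self._db = {}
        self._customers = customers
        self._next_id = 1

    def place_order(self, customer_id, item):
        # Customer data is obtained through the other service's API,
        # never by querying its database.
        if self._customers.get_customer(customer_id) is None:
            raise ValueError(f"unknown customer {customer_id}")
        order_id = self._next_id
        self._next_id += 1
        self._db[order_id] = {"customer_id": customer_id, "item": item}
        return order_id


customers = CustomerService()
orders = OrderService(customers)
first_order = orders.place_order(1, "book")
print(first_order)
```

Because `OrderService` only depends on `get_customer`, the customer store could change its schema or technology without breaking the order service, at the cost of an extra call between services.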
1.6 Security
OAuth 2.0 handles authorization (delegated access through tokens); OpenID Connect adds an authentication layer on top of OAuth 2.0.
1.7 Deployment
Docker: Containerization. Kubernetes: Orchestration.
Example: Netflix’s migration to microservices for scalability and reliability.
1.8 Challenges
- Distributed system complexity.
- Data consistency across services.
- Service discovery and inter-service communication.
1.9 References
Ref 1: Spring Microservices
2. Monolithic Architecture
2.1 Key Features
- Single codebase.
- Tight coupling between components.
- Centralized logging and monitoring.
- Simple to develop, test, and deploy because everything lives in one codebase.
- Fast initial development and deployment.
2.2 When to Use
- Small, simple applications.
- Rapid development and deployment.
- Small development team.
2.3 Limitations
- Difficult to scale.
- Hard to upgrade and add new features.
- Less suited to agile development and continuous delivery practices.
3. Data Science with Python
3.1 Key Libraries
- Pandas: Data manipulation and analysis.
- NumPy: Numerical computing.
- Matplotlib: Data visualization.
- Seaborn: Statistical data visualization.
- SciPy: Scientific computing.
- Scikit-learn: Machine learning.
3.2 Data Science Process
- Data Collection: Gather data from various sources (databases, APIs, files).
- Data Cleaning: Handle missing values, remove duplicates, correct errors.
- Data Exploration: Understand data through summary statistics and visualizations.
- Data Modeling: Build predictive models using machine learning algorithms.
- Data Interpretation: Analyze model outputs and insights.
- Data Visualization: Present insights using charts, graphs, and dashboards.
- Communication: Share findings with stakeholders.
3.3 Data Pipelines
ETL Process: Extract, Transform, Load.
Tools: Apache Hadoop, Spark, Informatica PowerCenter, Apache Kafka.
3.4 Requirements of Data Pipelines
- Extract data from multiple relevant data sources.
- Clean, transform, and enrich the data so it is ready for analysis.
- Load the data into a single source of truth, usually a data lake or a data warehouse.
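The three requirements above map onto the Extract, Transform, Load steps. The following is a minimal sketch using only the Python standard library; the sample CSV, the `sales` table, and the choice to fill a missing amount with 0.0 are all assumptions for illustration, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw data: stray spaces, a missing amount, a duplicate row.
RAW_CSV = """id,name,amount
1, Alice ,10.5
2,Bob,
2,Bob,
3,Cara,7
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop duplicate ids, trim names, fill missing amounts."""
    seen = set()
    cleaned = []
    for row in rows:
        key = row["id"]
        if key in seen:
            continue
        seen.add(key)
        # Assumption: a missing amount is treated as 0.0.
        amount = float(row["amount"]) if row["amount"].strip() else 0.0
        cleaned.append((int(key), row["name"].strip(), amount))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a single target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(total)
```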
Example: Using Pandas for data cleaning and Matplotlib for visualization.
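A small sketch of the cleaning half of that example, assuming pandas is installed. The city and temperature values are made up, and filling the missing reading with the column mean is only one of several reasonable strategies.

```python
import pandas as pd

# Hypothetical readings: one missing value and one exact duplicate row.
df = pd.DataFrame({
    "city": ["Colombo", "Kandy", "Galle", "Galle"],
    "temp_c": [31.0, None, 29.0, 29.0],
})

cleaned = (
    df.drop_duplicates()  # remove the repeated Galle row
      # Assumption: replace missing readings with the column mean.
      .assign(temp_c=lambda d: d["temp_c"].fillna(d["temp_c"].mean()))
      .reset_index(drop=True)
)

print(cleaned)
```

The cleaned frame could then be passed to Matplotlib, for instance as a bar chart of `temp_c` by `city`.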
3.5 References
Ref 2: W3Schools Data Science Tutorial
4. Artificial Intelligence (AI)
4.1 Types of AI
- Weak AI/Narrow AI: Task-specific (e.g., chatbots, recommendation systems).
- Strong AI/General AI: Human-like reasoning (hypothetical).
- Artificial Superintelligence (ASI): Exceeds human intelligence (hypothetical).
4.2 AI Paradigms
- Turing Paradigm: A machine counts as intelligent if, in conversation, a human judge cannot reliably tell it apart from a human (the Turing test).
- Connectionist Paradigm: Mimics human brain structure (neural networks).
- Evolutionary Paradigm: Uses genetic algorithms.
- Bayesian Paradigm: Probabilistic reasoning.
- Fuzzy Logic: Handles vagueness using degrees of truth between 0 and 1 instead of strict true/false values.
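Two of these paradigms are easy to show numerically. The sketch below applies Bayes' theorem to a made-up spam-filter scenario and evaluates a triangular fuzzy membership function. The function names, probabilities, and temperature thresholds are all assumptions chosen for the example.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayesian update: P(H | E) for a binary hypothesis H and evidence E."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


def triangular(x, low, peak, high):
    """Fuzzy membership: degree (0 to 1) to which x belongs to a fuzzy set."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


# Hypothetical numbers: 20% of mail is spam; the word "offer" appears in
# 60% of spam and in 5% of legitimate mail.
p_spam_given_offer = posterior(prior=0.2, likelihood=0.6, false_positive_rate=0.05)

# Hypothetical fuzzy set "warm": starts at 15 C, peaks at 25 C, ends at 35 C.
warm_degree = triangular(22.5, low=15.0, peak=25.0, high=35.0)

print(round(p_spam_given_offer, 2), warm_degree)
```

The Bayesian result is a probability that a crisp statement is true, whereas the fuzzy result is a degree of membership in a vague category; the two are related but not interchangeable.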
4.3 Subfields
- Machine Learning: Algorithms that improve with data.
- Deep Learning: Neural networks with multiple layers.
- Natural Language Processing (NLP): Language understanding and generation.
- Computer Vision: Image and video analysis.
4.4 Applications
Software-based:
- NLP (Google Translate, ChatGPT).
- Computer Vision (Google Photos, FaceID).
- Speech Recognition (Siri, Alexa).
Hardware-based:
- Robots, Autonomous Vehicles, Drones.
- AI Chips (Google TPUs, NVIDIA GPUs).
- Smart Home Devices (Nest, Alexa).
Example: Tesla Autopilot for autonomous vehicles.
4.5 References
Ref 3: NetLogo Multi-Agent Modeling
5. Social Network Analysis (SNA)
5.1 Graph Theory Basics
- Undirected Graphs: No directionality.
- Directed Graphs: Directional relationships.
- Centrality Measures:
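  - Degree Centrality: Number of direct connections a node has.
  - Betweenness Centrality: How often a node lies on the shortest paths between other nodes (a bridge between groups).
  - Eigenvector Centrality: A node's influence, weighted by the influence of the nodes it is connected to.
  - Closeness Centrality: How near a node is to all other nodes (based on shortest-path distances).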
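Degree and closeness centrality can be computed by hand on a small graph. The sketch below uses plain Python rather than a graph library; the five-node undirected graph is made up, the graph is assumed to be connected, and both measures use the common normalization by n - 1.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency mapping.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}


def degree_centrality(g):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(g)
    return {node: len(neighbors) / (n - 1) for node, neighbors in g.items()}


def shortest_path_lengths(g, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    pending = deque([source])
    while pending:
        current = pending.popleft()
        for neighbor in g[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                pending.append(neighbor)
    return dist


def closeness_centrality(g):
    """(n - 1) divided by the sum of distances to all other nodes."""
    n = len(g)
    result = {}
    for node in g:
        total = sum(shortest_path_lengths(g, node).values())
        result[node] = (n - 1) / total if total else 0.0
    return result


degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_central = max(closeness, key=closeness.get)
print(degree["C"], closeness["C"], most_central)
```

Node C scores highest on both measures here because it links the A-B-C triangle to the D-E chain.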
5.2 Data Management
Tools: Gephi, UCINET, Pajek.
Best practices: Standardization, cleaning, documentation.
5.3 Applications
- Identifying key influencers.
- Detecting communities/clusters.
- Analyzing information flow.
- Security and crime analysis (gang activity, fraud detection).
5.4 References
Ref 4: NetworkX Reference
6. Digital Forensics
6.1 Key Concepts
Branches:
- Computer Forensics
- Mobile Forensics
- Network Forensics
- Database Forensics
Digital Evidence Types:
- Emails, chat logs, internet browser histories, metadata, deleted files, contents of computer memory.
6.2 Forensic Process
- Identification: Gather information and identify potential evidence sources.
- Collection: Secure and preserve evidence (forensic imaging).
- Examination: Extract and filter the collected data to isolate relevant items.
- Analysis: Interpret evidence and draw conclusions.
- Presentation: Report findings for legal proceedings.
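One routine check during the collection step is confirming that a forensic image is a bit-for-bit copy of the source by comparing cryptographic hashes. The sketch below illustrates the idea with temporary files; the file names and contents are invented, and a real acquisition would use a write blocker and dedicated imaging tools rather than `shutil.copyfile`.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "evidence.bin")  # hypothetical source
    image = os.path.join(workdir, "evidence.img")   # hypothetical image copy

    with open(source, "wb") as handle:
        handle.write(b"example evidence bytes")

    shutil.copyfile(source, image)
    hashes_match = sha256_of_file(source) == sha256_of_file(image)

    # Simulate tampering: any change to the image changes its hash.
    with open(image, "ab") as handle:
        handle.write(b"!")
    tamper_detected = sha256_of_file(source) != sha256_of_file(image)

print(hashes_match, tamper_detected)
```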
6.3 Chain of Custody
Purpose: Document handling and movement of evidence.
Key Points:
- Document date, time, description of evidence, and handler’s name.
- Prevents tampering, contamination, or loss of evidence.
- Starts at collection and ends at court presentation.
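The key points above could be modelled as an append-only log, one entry per transfer or action. This is only a sketch: the `CustodyEntry` and `EvidenceItem` classes, their field names, and the sample handlers are assumptions, not a standard or legally recognized format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CustodyEntry:
    """One immutable record: when, who, what was done, and to what."""
    timestamp: datetime
    handler: str
    action: str
    description: str


@dataclass
class EvidenceItem:
    evidence_id: str
    log: list = field(default_factory=list)

    def record(self, handler, action, description, when=None):
        """Append a new entry; existing entries are never edited."""
        entry = CustodyEntry(
            when or datetime.now(timezone.utc), handler, action, description
        )
        self.log.append(entry)
        return entry

    def current_handler(self):
        """The person responsible for the evidence right now."""
        return self.log[-1].handler if self.log else None


# Hypothetical usage: collection at the scene, then hand-over to the lab.
item = EvidenceItem("EV-001")
item.record("Officer A", "collected", "Laptop seized from suspect's desk")
item.record("Analyst B", "received", "Laptop received at forensic lab")
print(item.current_handler(), len(item.log))
```

Making each entry frozen reflects the requirement that records are not altered after the fact; corrections would be added as new entries.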
6.4 Roles in Law Enforcement
- Cybercrime investigations.
- Fraud investigations.
- Child exploitation cases.
- Terrorism investigations.
Example: Using Autopsy for forensic analysis.
6.5 References
Ref 5: Python Digital Forensics Tutorial
7. Data Visualization
7.1 Benefits
- Easier identification of trends and patterns.
- Improved decision-making.
- An accessible way to share information with non-technical audiences.
- Reveals relationships between variables that are hard to see in raw tables.
7.2 Tools
Matplotlib, Seaborn, Plotly, Tableau.
Example: Using Seaborn for statistical data visualization.
8. Key Exam Insights
Commonly Tested Topics:
| Topic | Key Points |
| Microservices | Decentralized data management, independent deployment, communication protocols. |
| Monolithic Architecture | Single codebase, simplicity, difficulty in scaling. |
| Data Science with Python | Pandas, NumPy, ETL process, data visualization. |
| AI | Weak vs. Strong AI, AI paradigms, subfields (ML, NLP, CV). |
| SNA | Graph theory, centrality measures, data management tools. |
| Digital Forensics | Evidence types, forensic process, chain of custody. |
| Data Visualization | Benefits, tools, and applications. |
Exam Tips
- Focus on decentralized data management in microservices.
- Understand the difference between monolithic and microservices architectures.
- Know the steps of the data science process and key Python libraries.
- Be familiar with AI paradigms and their applications.
- Practice graph theory and centrality measures for SNA.
- Review the forensic process and chain of custody for digital forensics.